semantic kitti
Data-Efficient Point Cloud Semantic Segmentation Pipeline for Unimproved Roads
Yarovoi, Andrew, Valenta, Christopher R.
In this case study, we present a data-efficient point cloud segmentation pipeline and training framework for robust segmentation of unimproved roads and seven other classes. Our method employs a two-stage training framework: first, a projection-based convolutional neural network is pre-trained on a mixture of public urban datasets and a small, curated in-domain dataset; then, a lightweight prediction head is fine-tuned exclusively on in-domain data. Along the way, we explore the application of Point Prompt Training to batch normalization layers and the effects of Manifold Mixup as a regularizer within our pipeline. We also explore the effects of incorporating histogram-normalized ambients to further boost performance. Using only 50 labeled point clouds from our target domain, we show that our proposed training approach improves mean Intersection-over-Union from 33.5% to 51.8% and the overall accuracy from 85.5% to 90.8%, when compared to naive training on the in-domain data. Crucially, our results demonstrate that pre-training across multiple datasets is key to improving generalization and enabling robust segmentation under limited in-domain supervision. Overall, this study demonstrates a practical framework for robust 3D semantic segmentation in challenging, low-data scenarios.
Semantic segmentation of 3D point clouds is a foundational task for scene understanding, enabling a range of downstream applications such as autonomous route planning and infrastructure inspection. Despite significant progress in this field, most state-of-the-art segmentation models rely heavily on the availability of large, labeled training datasets. However, generating labeled point cloud data remains a substantial bottleneck: manual annotation is both labor-intensive and time-consuming, requiring over 30 minutes per scan on average in our experiments. This challenge makes it impractical to recreate large-scale datasets, commonly containing over 25,000 scans, for new or underrepresented environments.
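The two metrics quoted above, mean Intersection-over-Union and overall accuracy, are conventionally derived from a per-point confusion matrix. The following is a minimal sketch of that standard computation, not the authors' evaluation code; the function names, the three-class toy labels, and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Rows index the ground-truth class, columns the predicted class."""
    idx = y_true * num_classes + y_pred
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def miou_and_oa(conf):
    """Return (mean IoU, overall accuracy) for a confusion matrix."""
    tp = np.diag(conf).astype(float)
    # Union per class = predicted as class + labeled as class - intersection.
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    present = union > 0  # assumption: ignore classes that never occur
    iou = tp[present] / union[present]
    return iou.mean(), tp.sum() / conf.sum()

# Hypothetical per-point labels for a tiny three-class example.
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])
conf = confusion_matrix(y_true, y_pred, num_classes=3)
miou, oa = miou_and_oa(conf)
print(f"mIoU={miou:.3f}, OA={oa:.3f}")
```

Because mIoU averages over classes while overall accuracy averages over points, a rare class such as a narrow road surface can lower mIoU substantially while leaving overall accuracy nearly unchanged, which is consistent with the larger mIoU gap reported above.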
PaSCo: Urban 3D Panoptic Scene Completion with Uncertainty Awareness
Cao, Anh-Quan, Dai, Angela, de Charette, Raoul
We propose the task of Panoptic Scene Completion (PSC), which extends the recently popular Semantic Scene Completion (SSC) task with instance-level information to produce a richer understanding of the 3D scene. Our PSC proposal utilizes a hybrid mask-based technique on the non-empty voxels from sparse multi-scale completions. Whereas the SSC literature overlooks uncertainty, which is critical for robotics applications, we instead propose an efficient ensembling to estimate both voxel-wise and instance-wise uncertainties along PSC. This is achieved by building on a multi-input multi-output (MIMO) strategy, while improving performance and yielding better uncertainty for little additional compute. Additionally, we introduce a technique to aggregate permutation-invariant mask predictions. Our experiments demonstrate that our method surpasses all baselines in both Panoptic Scene Completion and uncertainty estimation on three large-scale autonomous driving datasets. Our code and data are available at https://astra-vision.github.io/PaSCo.
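As a rough illustration of how an ensemble can yield voxel-wise uncertainty, the sketch below averages the class probabilities of several members (for example, the outputs of MIMO subnetworks) and uses the entropy of the averaged distribution as the uncertainty score. This is a generic ensembling recipe and not PaSCo's actual implementation; the function name, array shapes, and toy probabilities are assumptions.

```python
import numpy as np

def ensemble_uncertainty(probs):
    """probs: array of shape (M, N, C) holding softmax outputs of
    M ensemble members for N voxels and C classes (hypothetical layout).
    Returns per-voxel labels and predictive entropy."""
    mean_probs = probs.mean(axis=0)  # average over members
    # Entropy of the averaged distribution; epsilon guards against log(0).
    entropy = -(mean_probs * np.log(mean_probs + 1e-12)).sum(axis=1)
    return mean_probs.argmax(axis=1), entropy

# Two members, two voxels, two classes: the members agree on voxel 0
# and disagree on voxel 1.
probs = np.array([
    [[0.9, 0.1], [0.8, 0.2]],
    [[0.9, 0.1], [0.2, 0.8]],
])
labels, entropy = ensemble_uncertainty(probs)
print(labels, entropy)
```

The voxel on which the members disagree receives the higher entropy, which is the behavior one would want from an uncertainty estimate used downstream in robotics.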